1. INTRODUCTION

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causes the novel coronavirus disease of
2019 (COVID-19) [1]. The disease first emerged in Wuhan, in the Chinese province of Hubei [2], and
spread globally from there. The World Health Organization (WHO) classified it as a pandemic in
March 2020 [3]. As of December 1st, 2020, around 64 million cases had been confirmed and 1.5 million
people had died globally [4]. COVID-19 causes tiredness, inflamed lungs, high body temperature,
coughing, and loss of smell and taste. It usually spreads from an infected to a non-infected person
through physical contact, such as touching, and through coughing [5].

At present, the best technique for diagnosing the disease is the reverse transcription polymerase
chain reaction (RT-PCR) [6], yet this test is still not accurate enough to stop the spread and put a
definite end to the pandemic [7]. A false-negative result can spread the infection over large areas,
because the patient is in reality infected but was not correctly diagnosed [8]. To improve the test
results, radiology images are used [9]. Chest radiographs and computed tomography (CT) scans play a
crucial role in identifying the virus by imaging the lungs. Certain cases were diagnosed with CT scans
before RT-PCR could confirm the infection [10]. Skilled radiologists are in short supply during the
pandemic, making it harder to diagnose potentially infected patients in time. Furthermore, because the
COVID-19 epidemic is new, it is challenging to gather useful data on chest X-ray (CXR) images [11]. As
a result, automated diagnosis technologies are in high demand [12]. Deep learning (DL) models for
automated image processing have the potential to maximize the role of radiological images in the
accurate and rapid diagnosis of COVID-19 [13]. Because of their large number of parameters,
convolutional neural networks (CNNs) may readily overfit on small datasets; hence, generalization
performance depends on the amount of labeled data [14]. A CNN uses a hierarchical design to
automatically extract deep features, which is particularly successful in a variety of visual tasks,
including image denoising, object recognition [15], and classification [16]. Many research projects
have been carried out intensely and rapidly to create artificial intelligence (AI) techniques for
responding to the COVID-19 global outbreak. This review presents relevant work on COVID-19
classification using chest CT images.

Goel et al. [17] introduced an optimized convolutional neural network (OptCoNet) for automated
detection of COVID-19 on chest X-rays. The CNN architecture consists of feature extraction and
classification components. The grey wolf optimizer (GWO) was used to tune the CNN's hyperparameters.
The data consisted of lung X-rays of healthy individuals and of individuals suffering from pneumonia,
obtained from public sources. In total, there were 2700 images, 900 of them COVID-19 images. According
to the authors, the optimized CNN model beats state-of-the-art methods. The reported results were an
accuracy of 97.78%, sensitivity of 97.75%, specificity of 96.25%, precision of 92.88%, and F1-score of
95.25%.

Polsinelli et al. [18] presented a SqueezeNet-based convolutional neural model for efficiently
distinguishing COVID-19 computed tomography (CT) images from those of other pneumonia and/or healthy
CT scans. The proposed CNN-2 outperforms the original SqueezeNet. CNN-2 obtained an accuracy of
85.03%, a sensitivity of 87.55%, a specificity of 81.95%, a precision of 85.01%, and an F1-score of
86.2%. With effective pre-processing, the method's performance can be improved significantly without
graphics processing unit (GPU) acceleration.

Horry et al. [19] constructed a deep learning-based COVID-19 detection system based on the VGG19
model for various medical images (X-ray, CT, ultrasound), with a precision of up to 86% for X-ray,
100% for ultrasound, and 84% for CT scans. Song et al. [20] collected CT scans of 88 patients with
COVID-19, 100 patients with bacterial pneumonia, and 86 healthy people from two Chinese hospitals,
and used these data to develop a deep learning-based CT diagnosis system for identifying COVID-19
patients from chest CT scans. The testing findings revealed that this model could successfully
distinguish COVID-19 patients from bacterial pneumonia patients, with an AUC of 95%, a sensitivity of
96%, and a precision of 79%. The study was conducted at the initial stage of the COVID-19 pandemic, so
there was a shortage of samples for developing such a deep learning prototype. In addition, owing to
batch effects, the approach performed well on data from the hospitals that collected the source data
but could not directly predict on external data.

Islam and Matin [21] employed a basic convolutional neural network (CNN) model, followed by a
LeNet-5 CNN model, to identify COVID-19 in chest CT. They trained and tested on a CT dataset that
includes 349 COVID-19 and 397 non-COVID-19 CT scan frames of the lungs. For COVID-19 detection, they
obtained an accuracy of 86.06%, an F1-score of 87%, a precision of 85%, a recall of 89%, and an area
under the ROC curve of 0.86. Garain et al. [22] used CT scans to develop a three-layer DCSNN for
COVID-19 screening. For the potential-based model, the method achieved an F1-score of 99%. The
proposed SNN-based model outperforms standard deep learning models on chest CT images, but it requires
more time to train. On the other hand, the approach is more efficient than previous deep learning
models. Zhang et al. [23] created a new technique (DenseNet-OTLS) for recognizing COVID-19 patients
from chest CT scans that combines DenseNet with an optimized transfer learning setting (OTLS) strategy.
The dataset was made up of 320 images from 142 COVID-19-positive patients and another 320 images from
142 COVID-19-negative patients. The OTLS comprises optimization of the composite learning factor
setting (CLFS) and optimization of the DenseNet structure. The proposed approach achieved a
sensitivity of 96.35%, a precision of 96.29%, a specificity of 96.25%, and an accuracy of 96.30%,
outperforming other methods in diagnosing COVID-19.

Ouyang et al. [24] proposed a new 3D convolutional network with an online attention module for
diagnosing COVID-19 from chest CT scans. They trained and validated it using multi-center CT data from
different hospitals, totaling 2186 CT images from 1588 patients. A comparable dataset of 2796 CT
images from 2057 patients was used for the testing stage. The network is a 3D ResNet34 architecture
with an attention module. Because the training data lacked class balance, two models were used, one
with uniform sampling and one with size-balanced sampling. The predictions of the two models were then
combined using a transfer learning approach. The reported accuracy, AUC, specificity, sensitivity, and
F1-score were 87.5%, 0.944, 90.1%, 86.9%, and 82.0%, respectively.

Wang et al. [25] suggested a novel collaborative learning framework for reliably identifying
COVID-19 from a variety of datasets with differing distributions. The network was split into two
sections: one with a lightweight architecture and four distinct convolution layers, and the other with
densely connected learning blocks. The SARS-CoV-2 and COVID-CT datasets were used to evaluate the
combined learning system. The SARS-CoV-2 dataset included 2,482 CT scans from 120 patients, 1,230 of
which were non-COVID but showed various symptoms of infectious lung disease and 1,252 of which were
COVID-19. The COVID-CT dataset included 397 CT scans from 171 persons without COVID-19 and 349 CT
scans from 216 persons with COVID-19. Evaluation metrics were computed on both datasets, and the
results on the SARS-CoV-2 dataset were better than those on the COVID-CT dataset. The overall accuracy
was 90%, with 85% recall, 95% precision, and a 90% F1-score.

Hasan et al. [26] predicted COVID-19 patients from CT images using a DenseNet-121-based CNN. The
model reached 92% accuracy and a 95% recall rate, demonstrating good performance for COVID-19
prediction. Pham [27] presented the findings of COVID-19 classification research with sixteen
pre-trained CNNs. In this study, transfer learning was used in place of data augmentation, which
resulted in higher classification rates. The dataset was randomly split into training and testing data
(80% and 20%, respectively). MobileNet-v2 achieved the highest accuracy of 95%, ResNet-18 the highest
sensitivity of 98%, DenseNet-201 the highest specificity of 96%, and MobileNet-v2 the highest F1-score
of 96%. MobileNet-v2, ShuffleNet, ResNet-18, and DenseNet-201 were among the networks compared.

Pathak et al. [28] used deep transfer learning (DTL) to create a classification model for
COVID-19-infected patients. The dataset included 413 images of COVID-19-infected persons and 439
images of healthy people or people with non-COVID-19 pneumonia. The training and testing ratios were
set at 60% and 40%, respectively. The suggested model achieves up to 96.2264% training accuracy and
93.0189% testing accuracy, from which the authors conclude that it could replace existing COVID-19
test equipment. Zhang et al. [29] developed an AI technique using a large computed tomography (CT)
database of 3,777 patients to identify COVID-19 pneumonia and distinguish it from healthy controls and
other forms of pneumonia. Using the convolutional neural network ResNet-18 model, the authors also
explored the relevance of detecting important clinical indicators. For COVID-19, their approach
obtained a sensitivity of 94.93%, a specificity of 91.13%, an AUC of 0.981, and an accuracy of 92.49%.

Perumal et al. [30] suggested increasing dataset quality and quantity in order to apply deep
learning to CT images and chest radiographs. They demonstrate the significance of categorizing
radiological images at an early stage of the disease. Deep transfer learning and self-supervised
techniques are proposed to avoid the costly manual labeling of large data samples. Deep transfer
learning approaches for both CT images and CXRs were investigated for numerous lung infections in
conjunction with SARS-CoV-2. Using VGG-16 and transfer learning, the suggested model surpasses other
current models with a recall of 90%, a precision of 91%, and an accuracy of 93%.

Wang et al. [31] developed a new multi-task prior-attention learning method for COVID-19 screening
in 3D chest CT images. They obtained CT scans from 4657 people: 936 normal scans, 2406 scans with
virus-induced interstitial lung disease, and 1315 scans with COVID-19. The PARL block was designed as
a single-model framework for end-to-end training by bringing two ResNet-based branches together. The
experimental findings showed that this approach outperformed other cutting-edge COVID-19 screening
methods, with an accuracy of 93.3%, a specificity of 95.5%, and a sensitivity of 87.6%.

Loey et al. [32] combined classical data augmentation and a CGAN with deep transfer learning to
detect COVID-19 in a limited set of lung CT images. To detect individuals infected with SARS-CoV-2
using chest tomography, they used a dataset of 742 CT images and five CNN-based models (GoogleNet,
ResNet50, AlexNet, VGGNet19, and VGGNet16). In all tested deep transfer models, classical data
augmentation combined with the CGAN improved classification outcomes. With a testing accuracy of
82.91%, a sensitivity of 77.66%, and a specificity of 87.62%, ResNet50 was the best model for
diagnosing SARS-CoV-2 from a constrained dataset using traditional data augmentation.

Wang et al. [33] hypothesized that artificial intelligence algorithms may be able to extract
specific graphical characteristics of COVID-19. To develop the approach, they used a dataset of 1065
CT images of cases with and without COVID-19 pneumonia to update the original transfer-learning model,
which was then internally and externally validated. Internal validation showed 89.5% accuracy, 87%
sensitivity, and 88% specificity. The external testing dataset showed an overall accuracy of 79.3%, a
sensitivity of 67%, and a specificity of 83%. Furthermore, for 54 COVID-19 images whose first two
nucleic acid test results were negative, the algorithm correctly predicted 46 as COVID-19 positive, an
accuracy of 85.2%.

The evaluation metrics in Table 1 (in the appendix) summarize the results of the different CNN
architectures in terms of COVID-19 detection accuracy, as well as the differences between the results
of the reviewed authors.
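
The studies above report accuracy, sensitivity (recall), specificity, precision, and F1-score. As a reminder of how these relate, the following minimal sketch derives them from the four confusion-matrix counts. The function name and the example counts are hypothetical and are not taken from any reviewed study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)      # recall: share of infected cases that are detected
    specificity = tn / (tn + fp)      # share of non-infected cases correctly cleared
    precision = tp / (tp + fp)        # share of positive predictions that are correct
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

# Hypothetical counts, chosen only for illustration.
metrics = classification_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A low sensitivity corresponds to many false negatives, which is the failure mode the introduction describes for RT-PCR.
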
2. METHODS
2.1. Convolutional neural network (CNN) 

A CNN is a particular kind of neural network [34] with a unique topology that was inspired by
biological research [35]. Fukushima first introduced the idea in 1980, and CNNs now have a broad range
of applications in activity recognition, sentence classification, biometrics, text recognition, object
detection and localization, analysis of scanned results, and so on. CNNs are composed of neurons, each
of which has learnable weights and its own bias. The network contains a single input layer and a
single output layer, as well as hidden layers, which include convolution, pooling, fully connected
(FC), and various normalization layers [36].


2.2. Explanation of the structure of every CNN layer 
2.2.1. Convolutional layer 

The convolutional layer is the most basic but also the most significant layer in a CNN. It works by
convolving, that is, multiplying and summing, the matrix of pixels of the given image with a filter to
build an activation map for that image. The fundamental advantage of activation maps is that they
retain the distinguishing features of an image while reducing the amount of data that must be
processed. The data is convolved with a matrix that serves as a feature detector. The feature detector
is, in essence, a small matrix of numeric values. By altering the values of the feature detector, one
can obtain a wide variety of transformed images. Backpropagation is used to train the convolutional
model and achieve the most accurate results possible in each layer. Depth and padding are chosen so as
to give the lowest error [37].
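
To make the sliding-window operation concrete, here is a minimal sketch in plain Python. It assumes no padding and a stride of 1, and it computes the cross-correlation form that deep learning frameworks usually call convolution. The image and the feature detector values are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the activation map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Element-wise product of the kernel and the image patch, then summed.
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Hypothetical 4x4 image and 2x2 feature detector (responds to diagonal differences).
image = [[1, 2, 0, 1],
         [0, 1, 3, 1],
         [2, 1, 0, 0],
         [1, 0, 1, 2]]
kernel = [[1, 0],
          [0, -1]]

activation_map = conv2d_valid(image, kernel)
for row in activation_map:
    print(row)
```

The 4x4 input yields a 3x3 activation map. In a trained CNN the kernel values are not chosen by hand as they are here but are learned by backpropagation.
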


2.2.2. Pooling 

Pooling is a key step in lowering the dimensions of the activation map, retaining just the necessary
elements while adding spatial invariance. As a result, the number of learnable parameters in the model
is reduced [38], which helps to address the overfitting issue. Pooling enables a CNN to absorb the
different resolutions and orientations of an image, allowing it to detect the given object effectively
even if its shape is distorted or at a different angle. Pooling may be classified into several
categories, including max pooling, average pooling, stochastic pooling, and spatial pyramid pooling
[37]. The most prominent of them is max pooling [39]. Max pooling divides the image into rectangular
sub-regions and returns only the highest value within each sub-region [40]. 2x2 is one of the most
commonly used sizes in max pooling [41]. The operation of max pooling is shown in Figure 1.


Figure 1. Operation of max-pooling [40]
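
The max-pooling operation can be sketched in a few lines of plain Python. The sketch assumes a 2x2 window with a stride of 2, and the feature-map values are hypothetical rather than the ones in the figure.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Return the maximum of each `size` x `size` window, moving by `stride`."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(feature_map[i * stride + di][j * stride + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Hypothetical 4x4 activation map.
feature_map = [[1, 3, 2, 1],
               [4, 6, 5, 0],
               [7, 2, 9, 8],
               [1, 0, 3, 4]]

pooled = max_pool2d(feature_map)
print(pooled)
```

Each 2x2 sub-region collapses to its largest value, so the 4x4 map becomes 2x2 and the layers that follow process a quarter of the data.
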


2.2.3. Fully connected layer 

This is the final layer of the neural network: a classifier that outputs the identification result
[42]. In general, the feature matrices are flattened before being passed on to the neurons. Beyond
this point it is difficult to trace the data, owing to the large number of hidden layers with
different weights for each neuron's output. All data reasoning and computation are done here.
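
The flatten-then-classify step can be sketched as follows. This is a minimal illustration with a single fully connected layer and a softmax output over two classes. The weights, biases, and input values are hypothetical, and real networks learn them during training.

```python
import math

def flatten(feature_maps):
    """Flatten a list of 2D feature maps into a single 1D vector."""
    return [value for fmap in feature_maps for row in fmap for value in row]

def dense(x, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Turn raw scores into class probabilities that sum to 1."""
    peak = max(logits)                      # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

feature_maps = [[[6, 5], [7, 9]]]           # one hypothetical 2x2 pooled map
weights = [[0.1, 0.0, -0.1, 0.2],           # hypothetical weights for class 0
           [-0.2, 0.1, 0.1, 0.0]]           # hypothetical weights for class 1
biases = [0.0, 0.5]

x = flatten(feature_maps)
probs = softmax(dense(x, weights, biases))
print(x, probs)
```
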


2.3. CNN 1 

The CNN architecture has two main parts: the convolutional part for feature extraction, and a fully
connected part that predicts the image class. The most popular CNN architectures [43], [44] include
LeNet [41], GoogleNet [45], ResNet, VGG-16, AlexNet [46], MobileNetV2, and DenseNet.


2.3.1. LeNet 

LeNet is one of the first CNNs, and it has mostly been used for recognizing and distinguishing
digits. Yann LeCun published this design in 1998, and it is based on the MNIST database. The network's
main design applies a 5x5 convolution to the input, followed by 2x2 average pooling with a stride of
2; this pattern is repeated twice, and the network is eventually terminated with two fully connected
layers. The last input to the fully connected part has the dimensions 120x1x1. The number of
parameters is around 60,000 [41], [47]. The details are illustrated in Figure 2.


Figure 2. LeNet architecture [41]
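
The 120x1x1 input to the fully connected part can be checked by tracing the spatial size through the layers. The sketch below assumes the commonly cited LeNet-5 settings, which are not stated in the text above: a 32x32 input, no padding, and 6, 16, and 120 feature maps in the three convolutional stages.

```python
def out_size(size, kernel, stride=1):
    """Spatial size after a convolution or pooling window without padding."""
    return (size - kernel) // stride + 1

# (name, kernel size, stride, output channels); assumed LeNet-5 settings.
layers = [
    ("C1 conv 5x5", 5, 1, 6),
    ("S2 avg-pool 2x2", 2, 2, 6),
    ("C3 conv 5x5", 5, 1, 16),
    ("S4 avg-pool 2x2", 2, 2, 16),
    ("C5 conv 5x5", 5, 1, 120),
]

size = 32                                   # assumed 32x32 grayscale input
shapes = []
for name, kernel, stride, channels in layers:
    size = out_size(size, kernel, stride)
    shapes.append((name, channels, size, size))
    print(f"{name}: {channels}x{size}x{size}")
```

Under these assumptions the spatial size shrinks from 32 to 28, 14, 10, 5, and finally 1, so the last convolutional stage hands a 120x1x1 tensor to the fully connected layers.
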


2.3.2. VGG-16 

VGG is an abbreviation for Visual Geometry Group [48], the group at the University of Oxford that
created the network [36]. The network debuted at the ILSVRC 2014 competition [49], where it was a
runner-up but was well recognized and accepted. It has around 138 million parameters [50]. There are
16 weight layers in total [51]: 13 convolutional layers and three fully connected layers, two of which
have 4096 hidden units each. The VGG-16 design is shown in Figure 3.


Figure 3. The basic architecture of VGG-16 [49] (legend: convolution+ReLU, max pooling, fully
connected+ReLU, softmax)
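
The figure of around 138 million parameters can be reproduced by counting weights and biases layer by layer. The sketch below assumes the standard VGG-16 configuration for ImageNet, which is not spelled out in the text above: 3x3 convolutions, a 224x224 RGB input, five 2x2 max-pooling stages, and 1000 output classes.

```python
# Output channels of each 3x3 convolution; "M" marks a 2x2 max-pooling stage.
CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
          512, 512, 512, "M", 512, 512, 512, "M"]

def count_vgg16_parameters(in_channels=3, image_size=224, num_classes=1000):
    """Return (number of conv layers, conv parameters, fully connected parameters)."""
    conv_layers, conv_params = 0, 0
    channels, size = in_channels, image_size
    for item in CONFIG:
        if item == "M":
            size //= 2                      # pooling halves the spatial size
        else:
            conv_params += 3 * 3 * channels * item + item   # weights + biases
            channels = item
            conv_layers += 1
    fc_params = 0
    fan_in = channels * size * size         # flattened input to the first FC layer
    for fan_out in (4096, 4096, num_classes):
        fc_params += fan_in * fan_out + fan_out
        fan_in = fan_out
    return conv_layers, conv_params, fc_params

conv_layers, conv_params, fc_params = count_vgg16_parameters()
total = conv_params + fc_params
print(conv_layers, conv_params, fc_params, total)
```

Under these assumptions the count comes to 138,357,544 parameters, of which roughly 124 million sit in the three fully connected layers.
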


2.3.3. GoogLeNet

In 2014, GoogLeNet achieved first place in the ImageNet competition (ILSVRC) [52]. Christian
Szegedy introduced the idea for the network structure. The structure builds on the LeNet and AlexNet
frameworks, with adjustments to the depth and width of the network. The architecture includes 22
network layers. It employs a novel parallel structure, which drastically shortens the training cycle.
VGGNet and GoogLeNet pushed the research boom in deep learning to its peak [53].


2.3.4. DenseNet 

DenseNet, a logical extension of ResNet, improves performance by concatenating each layer's feature
map with those of the preceding layers within a dense block. This allows later layers of the network
to directly exploit the features of earlier layers, promoting feature reuse throughout the network
[51]. Each layer takes the feature maps of all previous layers as inputs, and its own feature maps are
used as inputs to all subsequent layers, which helps to ease the vanishing gradient problem, encourage
feature reuse, and reduce the number of parameters [54]. The structure of a dense block is shown in
Figure 4.


Figure 4. The architecture of dense blocks in DenseNet [55]
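
The connectivity pattern of a dense block can be sketched without any deep learning library. In the sketch below, `toy_layer` is a hypothetical stand-in for a real convolutional layer, and the number of new channels each layer adds (the growth rate) is an assumed parameter. The point is only that every layer receives the concatenation of all earlier feature maps.

```python
def toy_layer(channels, growth_rate):
    """Hypothetical stand-in for a conv layer: emits `growth_rate` new channels,
    each a scaled average of all input channels."""
    n = len(channels)
    mean = [sum(values) / n for values in zip(*channels)]
    return [[(k + 1) * v for v in mean] for k in range(growth_rate)]

def dense_block(x, num_layers, growth_rate):
    """Run `num_layers` layers, each fed with every feature map produced so far."""
    features = list(x)                      # running concatenation along the channel axis
    for _ in range(num_layers):
        new_maps = toy_layer(features, growth_rate)   # layer sees all earlier maps
        features.extend(new_maps)                     # its output is kept for later layers
    return features

x = [[1.0, 2.0], [3.0, 4.0]]                # 2 input channels with 2 values each
out = dense_block(x, num_layers=3, growth_rate=4)
print(len(out))
```

With 2 input channels, 3 layers, and a growth rate of 4, the block outputs 2 + 3 x 4 = 14 channels, and the original input channels are still present unchanged at the front.
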


2.3.5. Microsoft ResNet 

ResNet, or deep residual network, won the ImageNet competition in 2015, surpassing human accuracy
for the first time with an error rate of roughly 3.6%. It is a 152-layer model, and this single model
achieved state-of-the-art results in classification, localization, and detection. Its key element is
the residual block, which overcomes the difficulty of training deep models by building identity skip
connections between layers, allowing inputs to be copied to the next layer. The goal of this method is
for the next layer to learn something new and different from the input of the preceding layer [36].
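
A residual block computes y = ReLU(F(x) + x), where F is the learned transformation and the added x is the identity skip connection. The following minimal sketch uses small linear maps in place of convolutions; the weights and the input vector are hypothetical.

```python
def relu(v):
    return [max(0.0, a) for a in v]

def linear(v, weights):
    """Matrix-vector product, standing in for a convolution in this sketch."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]

def residual_block(x, w1, w2):
    """y = ReLU(F(x) + x), with F made of two linear maps and a ReLU in between."""
    fx = linear(relu(linear(x, w1)), w2)    # the residual F(x)
    return relu([f + a for f, a in zip(fx, x)])   # add the identity shortcut

x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]

# With all-zero weights F(x) = 0, so the block simply passes x through.
y = residual_block(x, zeros, zeros)
print(y)
```

Because the shortcut carries the input forward unchanged, the layers inside the block only need to learn the difference from the identity, which is what makes very deep stacks trainable.
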


3. DISCUSSION 

The previous sections discussed work from the past two years on various types of CNN models for
COVID-19 diagnosis and severity detection from chest CT images. The authors of [19], [30], and [32]
applied VGG models, alone, together with another model, or with transfer learning, to a variety of
datasets to identify COVID-19 from CT scan samples. The best of these results were 93% accuracy and
91% precision [30]. Zhang et al. [23] and Hasan et al. [26] used DenseNet; [26] reported an accuracy
of 92% and a sensitivity of 95%, while the COVID-19 diagnosis results in [23] were higher. The
researchers in [24], [29], and [31] used the ResNet model, alone or with other models or methods, on
different datasets; the 93.3% accuracy in [31] is higher than the results of [24] and [29]. The method
of [17] is favored over the others because it achieved the highest accuracy of all the studies
reviewed, 97.78%, together with a high sensitivity of 97.75%. Goel et al. [17] used the proposed CNN
for feature extraction and classification and used the grey wolf optimizer to tune the CNN
hyperparameters, which is why it performs better than the other methods.


4. CONCLUSION 

This review gives a comprehensive summary of state-of-the-art deep learning applications for
combating the COVID-19 pandemic, which aim to identify and detect COVID-19 in its early stages using
chest HRCT images. CNNs and their modified variants were found to be the models most commonly employed
for COVID-19 prediction. The included studies have shown that DL approaches have a considerable
influence on early COVID-19 identification, with high accuracy rates. However, the majority of the
proposed methods are still in the early stages of development and require substantial further
research.
APPENDIX
Table 1. COVID-19 CT-scan image detection based on CNN models
Ref | Dataset | Methods | Results
[17] | COVID-19: 900 CT images; Non-COVID: 1800 CT images | Optimized convolutional neural network (OptCoNet) | Acc. 97.78%; Prec. 92.88%; Sens. 97.75%; Spec. 96.25%; F1-score 95.25%
[18] | Zhao et al. dataset: COVID-19: 360 CT scans, Non-COVID: 397 CT scans; Italian dataset: 100 CT scans of COVID-19 | SqueezeNet-based CNN | Acc. 85.03%; Prec. 85.01%; Sens. 87.55%; Spec. 81.95%; F1-score 86.20%
[19] | X-ray: COVID-19: 140 X-ray images; Non-COVID: 60683 X-ray images | VGG19 | Prec. 86%; F1-score 87%
[19] | Ultrasound: COVID-19: 399 ultrasound images; Non-COVID: 512 ultrasound images | VGG19 | Prec. 100%; F1-score 99%
[19] | CT: COVID-19: 349 CT images; Non-COVID: 397 CT images | VGG19 | Prec. 84%; F1-score 78%
[20] | COVID-19: 777 CT images; Non-COVID: 1213 CT images | Pre-trained ResNet50-based DRENet | Prec. 79%; Sens. 96%; AUC 95%
[21] | COVID-19: 349 CT images; Non-COVID: 397 CT images | Basic CNN model + LeNet-5 CNN model | Acc. 86.06%; Prec. 85%; Sens. 89%; F1-score 87%; AUC 86%
[22] | COVID-19: 349 CT scan images; Non-COVID-19: 397 CT scan images | Three-layer DCSNN | F1-score 99%
[23] | COVID-19: 320 CT images; Non-COVID: 320 CT images | DenseNet-OTLS | Acc. 96.30%; Prec. 96.29%; Sens. 96.35%; Spec. 96.25%
[24] | Train and validate dataset: 2186 CT images; Test dataset: 2796 CT images | 3D ResNet34 + an online attention module | Acc. 87.5%; Sens. 86.9%; Spec. 90.1%; F1-score 82.0%; AUC 94.4%
[25] | Dataset 1: COVID-19: 1252 CT images; Non-COVID: 1230 CT images | Collaborative learning framework | Acc. 90.83%; Prec. 95.75%; Sens. 85.89%; F1-score 90.87%; AUC 96.24%
[25] | Dataset 2: COVID-19: 349 CT images; Non-COVID: 397 CT images | Collaborative learning framework | Acc. 78.69%; Prec. 78.02%; Sens. 79.71%; F1-score 78.83%; AUC 85.32%
[26] | COVID-19: 1252 CT images; Non-COVID: 1230 CT images | DenseNet-121-based convolutional neural network (CNN) | Acc. 92%; Sens. 95%
[27] | COVID-19: 349 CT images; Non-COVID: 397 CT images | Sixteen pre-trained CNNs | Acc. 95% (MobileNet-v2); Sens. 98% (ResNet-18); Spec. 96% (DenseNet-201); F1-score 96% (MobileNet-v2)
[28] | COVID-19: 413 CT images; Non-COVID: 439 CT images | Deep transfer learning (DTL) | Acc. 96.2264% (training accuracy); Acc. 93.0189% (testing accuracy)
[29] | Total data: 617,775 CT images | CNN ResNet-18 model | Acc. 92.49%; Sens. 94.93%; Spec. 91.13%; AUC 98.1%
[30] | Large chest X-ray and CT images dataset | VGG-16 + transfer learning | Acc. 93%; Prec. 91%; Sens. 90%
[31] | COVID-19: 1315 CT images; Non-COVID: 3342 CT images | 3D-ResNets + prior-attention mechanism | Acc. 93.3%; Sens. 87.6%; Spec. 95.5%
[32] | Total data: 742 CT-scan images | CGAN + five different deep CNN-based models (AlexNet, VGGNet16, VGGNet19, GoogleNet, and ResNet50) | Testing: Acc. 82.91%; Sens. 77.66%; Spec. 87.62%
[33] | Total data: 1065 CT images | Modified inception transfer-learning model | Internal validation: Acc. 89.5%; Spec. 88%; Sens. 87%
[33] | Total data: 1065 CT images | Modified inception transfer-learning model | External testing: Acc. 79.3%; Spec. 83%; Sens. 67%